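Various applications of farm animal imaging rely on estimating the weights of certain body parts and cuts from CT images of the animals. In many cases the problem is made harder by the large variability of postures in the CT images, which results from scanning non-sedated, living animals. In this paper, we propose a general and robust approach for estimating the weights of cuts and body parts from CT images of (possibly) living animals. We adapt multi-atlas-based segmentation driven by elastic registration, together with joint feature and model selection for the regression component, to cope with a large number of features and a small number of samples. The proposed technique is evaluated and illustrated through real applications in rabbit breeding programs, showing higher R^2 scores than previous techniques and methods. The proposed technique is easily adaptable to similar problems; it is therefore shared in an open-source software package for the benefit of the community.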
translated by Google Translate
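Over the last 15 years, the segmentation of vessels in retinal images has become an intensively researched problem in medical imaging, with hundreds of algorithms published. One of the de facto benchmark data sets for vessel segmentation techniques is the DRIVE data set. Since DRIVE contains a predefined split of training and test images, the published performance results of the various segmentation techniques should provide a reliable ranking of the algorithms. Including more than 100 papers in this study, we performed a detailed numerical analysis of the consistency of the published performance scores. We found inconsistencies in the reported scores related to the use of the field of view (FoV), which has a significant impact on the performance scores. We attempted to eliminate the biases using numerical techniques in order to provide a more realistic picture of the state of the art. Based on the results, we formulated several findings, most notably: despite the well-defined test set of DRIVE, most rankings in published papers are based on non-comparable figures; and, in contrast to the near-perfect accuracy scores reported in the literature, the highest accuracy score achieved to date is 0.9582 in the FoV region, which is 1% higher than that of the human annotator. The methods we developed for identifying and eliminating the evaluation biases can easily be applied to other domains where similar problems may arise.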
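To make the FoV issue concrete, here is a minimal sketch on a toy image showing how the same prediction yields different accuracy scores depending on whether pixels outside the field of view are counted. The `accuracy` helper and the 4x4 arrays are illustrative assumptions, not data or code from the study.

```python
import numpy as np

def accuracy(pred, truth, mask=None):
    """Pixel accuracy; if a boolean mask is given, only pixels inside it count."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if mask is None:
        mask = np.ones_like(truth, dtype=bool)
    mask = np.asarray(mask, dtype=bool)
    return float((pred[mask] == truth[mask]).mean())

# Toy 4x4 image: the FoV is the central 2x2 block, the rest is empty background.
fov = np.zeros((4, 4), dtype=bool)
fov[1:3, 1:3] = True

truth = np.zeros((4, 4), dtype=bool)
truth[1, 1] = truth[2, 2] = True   # two vessel pixels, both inside the FoV

pred = np.zeros((4, 4), dtype=bool)
pred[1, 1] = True                  # one vessel pixel found, one missed

acc_fov = accuracy(pred, truth, fov)  # 3 of 4 FoV pixels are correct
acc_all = accuracy(pred, truth)       # 15 of 16 image pixels are correct
print(acc_fov, acc_all)
```

The trivially correct background outside the FoV inflates the whole-image score, so two papers reporting "accuracy" under different conventions are not directly comparable.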
We study the ability of foundation models to learn representations for classification that are transferable to new, unseen classes. Recent results in the literature show that representations learned by a single classifier over many classes are competitive on few-shot learning problems with representations learned by special-purpose algorithms designed for such problems. We offer an explanation for this phenomenon based on the concept of class-features variability collapse, which refers to the training dynamics of deep classification networks where the feature embeddings of samples belonging to the same class tend to concentrate around their class means. More specifically, we examine the few-shot error of the learned feature map, which is the classification error of the nearest class-center classifier using centers learned from a small number of random samples from each class. Assuming that the classes appearing in the data are selected independently from a distribution, we show that the few-shot error generalizes from the training data to unseen test data, and we provide an upper bound on the expected few-shot error for new classes (selected from the same distribution) using the average few-shot error for the source classes. Additionally, we show that the few-shot error on the training data can be upper bounded using the degree of class-features variability collapse. This suggests that foundation models can provide feature maps that are transferable to new downstream tasks even with limited data available.
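The few-shot error described above can be sketched as follows: class centers are estimated from a few random samples per class, and the remaining samples are classified by their nearest center. The function name `few_shot_error`, the Euclidean distance, and the synthetic two-class data are assumptions made for illustration only.

```python
import numpy as np

def few_shot_error(features, labels, n_shots, rng):
    """Error of the nearest class-center classifier, with each center
    estimated from n_shots randomly chosen samples of that class."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    support = np.zeros(len(labels), dtype=bool)
    centers = []
    for c in classes:
        idx = np.flatnonzero(labels == c)
        chosen = rng.choice(idx, size=n_shots, replace=False)
        support[chosen] = True
        centers.append(features[chosen].mean(axis=0))
    centers = np.stack(centers)
    # Evaluate on the samples that were not used to estimate the centers.
    query_x, query_y = features[~support], labels[~support]
    dists = np.linalg.norm(query_x[:, None, :] - centers[None, :, :], axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred != query_y).mean())

rng = np.random.default_rng(0)
# Synthetic features for two classes that are tightly concentrated around
# well-separated means, mimicking strong variability collapse.
means = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = np.repeat([0, 1], 50)
features = means[labels] + 0.1 * rng.standard_normal((100, 2))
err = few_shot_error(features, labels, n_shots=5, rng=rng)
print(err)
```

When the within-class spread is small relative to the distance between class means, a handful of samples already gives accurate centers, which is the intuition behind bounding the few-shot error by the degree of collapse.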
We study the learning dynamics of self-predictive learning for reinforcement learning, a family of algorithms that learn representations by minimizing the prediction error of their own future latent representations. Despite their recent empirical success, such algorithms have an apparent defect: trivial representations (such as constants) minimize the prediction error, yet converging to such solutions is clearly undesirable. Our central insight is that careful design of the optimization dynamics is critical to learning meaningful representations. We identify that a faster-paced optimization of the predictor and semi-gradient updates on the representation are crucial to preventing representation collapse. Then, in an idealized setup, we show that self-predictive learning dynamics carry out a spectral decomposition of the state transition matrix, effectively capturing information about the transition dynamics. Building on these theoretical insights, we propose bidirectional self-predictive learning, a novel self-predictive algorithm that learns two representations simultaneously. We examine the robustness of our theoretical insights with a number of small-scale experiments and showcase the promise of the novel representation learning algorithm with large-scale experiments.
Property inference attacks against machine learning (ML) models aim to infer properties of the training data that are unrelated to the primary task of the model, and have so far been formulated as binary decision problems, i.e., whether or not the training data have a certain property. However, in industrial and healthcare applications, the proportion of labels in the training data is quite often also considered sensitive information. In this paper we introduce a new type of property inference attack that, unlike the binary decision problems in the literature, aims at inferring the class label distribution of the training data from the parameters of ML classifier models. We propose a method based on \emph{shadow training} and a \emph{meta-classifier} trained on the parameters of the shadow classifiers augmented with the accuracy of the classifiers on auxiliary data. We evaluate the proposed approach for ML classifiers with fully connected neural network architectures. We find that the proposed \emph{meta-classifier} attack provides a maximum relative improvement of $52\%$ over the state of the art.
Data-driven interatomic potentials have emerged as a powerful class of surrogate models for {\it ab initio} potential energy surfaces that are able to reliably predict macroscopic properties with experimental accuracy. In generating accurate and transferable potentials the most time-consuming and arguably most important task is generating the training set, which still requires significant expert user input. To accelerate this process, this work presents \text{\it hyperactive learning} (HAL), a framework for formulating an accelerated sampling algorithm specifically for the task of training database generation. The key idea is to start from a physically motivated sampler (e.g., molecular dynamics) and add a biasing term that drives the system towards high uncertainty and thus to unseen training configurations. Building on this framework, general protocols for building training databases for alloys and polymers leveraging the HAL framework will be presented. For alloys, ACE potentials for AlSi10 are created by fitting to a minimal HAL-generated database containing 88 configurations (32 atoms each) with fast evaluation times of <100 microsecond/atom/cpu-core. These potentials are demonstrated to predict the melting temperature with excellent accuracy. For polymers, a HAL database is built using ACE, able to determine the density of a long polyethylene glycol (PEG) polymer formed of 200 monomer units with experimental accuracy by only fitting to small isolated PEG polymers with sizes ranging from 2 to 32.
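One way to picture the biasing idea is sketched below, under two assumptions that the abstract does not state: the uncertainty is taken as the spread of a small committee of models, and the bias enters as a subtraction scaled by a hypothetical strength `tau`. The toy one-dimensional "potentials" are purely illustrative and unrelated to ACE.

```python
import numpy as np

def make_model(c):
    """Toy 1D energy model; members agree near x = 0 and disagree far from it."""
    return lambda x: x**2 + c * x**3

def biased_energy(x, committee, tau):
    """Mean committee energy minus tau times the committee spread.
    Lowering this quantity favours configurations with high disagreement."""
    energies = np.array([model(x) for model in committee])
    return float(energies.mean() - tau * energies.std())

committee = [make_model(c) for c in (-0.1, 0.0, 0.1)]

for x in (0.0, 1.0, 2.0):
    unbiased = biased_energy(x, committee, tau=0.0)
    biased = biased_energy(x, committee, tau=1.0)
    print(x, unbiased, biased)
```

A sampler that follows the biased energy is pulled away from regions where the committee already agrees, which is where new training configurations would add the least information.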
Density-based representations of atomic environments that are invariant under Euclidean symmetries have become a widely used tool in the machine learning of interatomic potentials, broader data-driven atomistic modelling, and the visualisation and analysis of materials datasets. The standard mechanism used to incorporate chemical element information is to create separate densities for each element and form tensor products between them. This leads to a steep scaling in the size of the representation as the number of elements increases. Graph neural networks, which do not explicitly use density representations, escape this scaling by mapping the chemical element information into a fixed-dimensional space in a learnable way. We recast this approach as tensor factorisation by exploiting the tensor structure of standard neighbour-density-based descriptors. In doing so, we form compact tensor-reduced representations whose size does not depend on the number of chemical elements, but which remain systematically convergeable and are therefore applicable to a wide range of data analysis and regression tasks.
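As one of the most intuitive interfaces known to humans, natural language has the potential to mediate many tasks that involve human-computer interaction, especially in application-focused fields such as music information retrieval. In this work, we explore cross-modal learning in an attempt to bridge audio and language in the music domain. To this end, we propose MusCALL, a framework for Music Contrastive Audio-Language Learning. Our approach consists of a dual-encoder architecture that learns the alignment between pairs of music audio and descriptive sentences, producing multimodal embeddings that can be used for text-to-audio and audio-to-text retrieval. Thanks to this property, MusCALL can be transferred to virtually any task that can be cast as text-based retrieval. Our experiments show that our method performs significantly better than the baselines at retrieving audio that matches a textual description and, conversely, text that matches an audio query. We also demonstrate that the multimodal alignment capability of our model can be successfully extended to the zero-shot transfer scenario for genre classification and auto-tagging on two public datasets.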
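The alignment objective of a dual-encoder model can be sketched as below, assuming a symmetric InfoNCE-style contrastive loss over a batch of paired embeddings. This is a common choice for such architectures, but the abstract does not specify the exact loss, and the function names and the `temperature` value are illustrative assumptions.

```python
import numpy as np

def log_softmax(x, axis):
    """Numerically stable log-softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def contrastive_loss(audio_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss; row i of each matrix is one matching pair."""
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = a @ t.T / temperature          # cosine similarities, scaled
    diag = np.arange(len(a))
    loss_a2t = -log_softmax(logits, axis=1)[diag, diag].mean()
    loss_t2a = -log_softmax(logits, axis=0)[diag, diag].mean()
    return float((loss_a2t + loss_t2a) / 2)

def retrieve(query_emb, candidate_embs):
    """Index of the candidate with the highest cosine similarity to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    c = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    return int((c @ q).argmax())

# Toy embeddings: perfectly aligned pairs versus deliberately mismatched pairs.
audio = np.eye(4)
text_aligned = np.eye(4)
text_shuffled = np.eye(4)[[1, 2, 3, 0]]

loss_aligned = contrastive_loss(audio, text_aligned)
loss_shuffled = contrastive_loss(audio, text_shuffled)
best = retrieve(text_aligned[2], audio)     # text-to-audio retrieval
print(loss_aligned, loss_shuffled, best)
```

Because both modalities land in the same embedding space, retrieval in either direction reduces to a nearest-neighbour search, which is also what makes zero-shot classification via text prompts possible.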
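Creating fast and accurate force fields is a long-standing challenge in computational chemistry and materials science. Recently, several equivariant message passing neural networks (MPNNs) have been shown to outperform models built using other approaches in terms of accuracy. However, most MPNNs suffer from high computational cost and poor scalability. We propose that these limitations arise because MPNNs only pass two-body messages, leading to a direct relationship between the number of layers and the expressivity of the network. In this work, we introduce MACE, a new MPNN model that uses higher body-order messages. In particular, we show that using four-body messages reduces the required number of message passing iterations to just two, resulting in a fast and highly parallelizable model that reaches or exceeds state-of-the-art accuracy on the rMD17, 3BPA, and AcAc benchmark tasks. We also demonstrate that using higher-order messages leads to an improved steepness of the learning curves.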
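In this paper, a novel solution is introduced for visual simultaneous localization and mapping (vSLAM) that is built from deep learning components. The proposed architecture is a highly modular framework in which each component offers state-of-the-art results in its respective field of vision-based deep learning. The paper shows that, through the synergic integration of these individual building blocks, a functioning and efficient all-through deep neural (ATDN) vSLAM system can be created. The Embedding Distance Loss function is introduced, and the ATDN architecture is trained using it. The resulting system achieved a translation error of 4.4% and a rotation error of 0.0176 deg/m on a subset of the KITTI dataset. The proposed architecture can be used for efficient, low-latency database creation in support of autonomous driving (AD), as well as a basis for autonomous vehicle (AV) control.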